Systems and methods for interpreting natural language search queries

ABSTRACT

Systems and methods are described herein for interpreting natural language search queries that account for contextual relevance of words of the search query that would ordinarily not be processed, including, for example, by processing each word of the query. Each term or phrase is associated with a respective part of speech, and a frequency of occurrence of a combination of adjacent terms or phrases in the public domain is determined. A relevance of each term is then determined based on its respective type of term and frequency of occurrence in the public domain. The natural language search query is then interpreted based on the importance or relevance of each term.

BACKGROUND

This disclosure relates to processing search queries and, more particularly, interpreting natural language search queries.

SUMMARY

With the spread of smart devices, users are more frequently entering search queries using natural language. Such queries are often difficult to decipher and can lead to an unsatisfactory final result for the user. Typically, natural language search queries are processed by simply applying a filter, such as content type, to the query, and returning any results that match the query within that filter. However, many natural language search queries include words that are contextually relevant to the search query but are ignored or not given appropriate weight by the processing system because they are not associated with any keyword or genre by themselves. Further, terms adjacent to each other may not be given proper weight, as each term may be grouped into a search domain, for example, a keyword, content type, or genre, and a search for that term may only be done within that domain. Thus, the results of the search do not provide the content for which the user was searching and require additional searching.

Systems and methods are described herein for interpreting natural language search queries that account for contextual relevance of words of the search query that would ordinarily not be processed, including, for example, by processing each word of the query. A natural language search query is received, either as a voice input, a text input, or a transcribed voice-to-text input, and a plurality of terms in the natural language search query are identified. Each term is associated with a respective term type, for example, a keyword, a genre, or a content type. Further, the system determines a term type for each term or phrase in the natural language search query. In some embodiments, the query may include a phrase, or a number of phrases, each made up of two or more consecutive terms, that may occur at any point in the query. For example, the term or phrase may be at the beginning, in the middle, or at the end of the query. For each term or phrase, a term type may be determined and a search performed within the domain of the term type. For example, if the term type is determined to be a genre, the search is performed within the genre domain. In some embodiments, adjacent terms may be determined to be different term types, for example, the natural language search query having a first term or phrase that is a keyword, and a second term or phrase that is a genre. In such an example, where the terms adjacent to each other are of different term types, a typical system will search for each term within its own term type, which may lead to limited results. On the other hand, the system may search for each respective term or phrase to determine if the terms or phrases should be merged or combined with another term based on the term type to improve the search result. The system performs a search in the public domain for the search terms or phrases to determine if they occur in tandem.
The system determines a frequency of occurrence of each term in content metadata and, based on the frequency, may update the metadata of each term with the new term type. Further, the system may determine a relational score of the first term to the second term based on a frequency of occurrence for the first term and the second term in the public domain. In response to the determined relational score for the first phrase and the second phrase, the system updates the first term or phrase and the second term or phrase in the context of the first term type. The natural language search query is then interpreted based on the relevance of each term, for example, a relation of the first term to the second term from the public domain. Search results are retrieved based on the interpreted search query, and the results are then generated for display.

For example, a search query for “I would like to watch a democratic drama movie” may be received, and the user may intend to search for movies that include a “democratic drama,” or in which a democratic drama is a major plot point. At the same time, the word “movie” may indicate the desired type of content, for example, TV series, short clips, or movies. The word “drama” may normally be identified as a genre indicating genre type, for example, drama, comedy, romantic comedy, thriller, or fiction. The word “democratic” may be identified as a keyword and may be associated with any actor or other identifying information that could narrow a search for keywords to those that are about democratic movies or have democratic as a major plot point. However, the system processes the word “democratic” as a keyword and the word “drama” as a genre and performs a search for the terms based on each term type. For example, the system searches for the keyword “democratic” among a listing of keywords and searches for the genre “drama” among a listing of genres. Based on this data, the word “democratic” is marked as a keyword, and the word “drama” is marked as a genre. The search query is interpreted as a query for movies in the genre drama whose metadata contain the word “democratic,” such as in a plot summary. Other examples may include searches for “civil war documentary movies.” These searches identify the primary type of content (“movies”), but the remaining words do not match up with any preexisting content identifiers that would allow for a meaningful search. Identifying terms such as “civil war” as uncommon search terms results in a determination that the term is relevant to the search query.

The natural language interpreter may also be trained using a training data set compiled from previous natural language searches that have been annotated. A frequency of occurrence for each term in the training data set is determined in relation to the entire training data set. A relational data structure is generated that associates each term in the training data with its respective frequency. Any term that has a frequency below a threshold frequency is then added to a list of relevant words. When a natural language search query is received, a plurality of terms in the natural language search query are identified and compared with the list of relevant words. If any term of the natural language search query is included in the relevant words list, that term is identified as a keyword. The natural language search query is then interpreted based on any identified keywords. As above, search results are retrieved based on the interpreted search query and generated for display to the user.

For example, the training data may include a total of ten thousand words, and the threshold frequency may be one percent. Thus, if a word appears in the training data fewer than one hundred times, then that word is added to the relevant words list. Using the above example, the relevance of the words “democratic” and “drama” can be determined by checking if the words “democratic” and “drama” appear on the relevant words list. If the words “democratic” and “drama” appear on the relevant words list, then they are identified as keywords, and the natural language search query is interpreted as a query for movies whose metadata contain the words “democratic” and “drama,” such as in a plot summary.
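The threshold test described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function names `build_relevant_words` and `identify_keywords` and the toy training data are hypothetical, and the training data is assumed to be already split into individual words.

```python
from collections import Counter

def build_relevant_words(training_terms, threshold=0.01):
    """Return the terms whose share of the training data is below `threshold`."""
    counts = Counter(training_terms)
    total = len(training_terms)
    return {term for term, n in counts.items() if n / total < threshold}

def identify_keywords(query_terms, relevant_words):
    """Mark as keywords the query terms found on the relevant words list."""
    return [term for term in query_terms if term in relevant_words]

# Hypothetical training data totaling ten thousand words.
training = (["movie"] * 6000 + ["watch"] * 3850
            + ["democratic"] * 60 + ["drama"] * 90)

relevant = build_relevant_words(training)  # one percent threshold
keywords = identify_keywords(["democratic", "drama", "movie"], relevant)
```

With these assumed counts, “democratic” (60 occurrences) and “drama” (90 occurrences) fall below the one hundred occurrences implied by a one percent threshold, while “movie” does not.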

In another example, the training data may include a total of ten thousand phrases, each a combination of two or more words, and the threshold frequency may be one percent. Thus, if a phrase appears in the training data fewer than one hundred times, then that phrase is added to the relevant phrase list. Using the above example, the relevance of the phrase “democratic drama” can be determined by checking if the phrase “democratic drama” appears on the relevant phrase list. If the phrase “democratic drama” appears on the relevant phrase list, then it is identified as a keyword, and the natural language search query is interpreted as a query for movies whose metadata contain the words “democratic drama,” such as in a plot summary.

Further, a search of the public domain may be performed for a phrase, a combination of two or more words, and the threshold of occurrence frequency may be a certain value. Thus, if a phrase appears in the public domain more than 10 times or another predetermined occurrence value, then that phrase is added to the relevant phrase list. Using the above example, the relevance of the phrase “democratic drama” can be determined by checking if the phrase “democratic drama” appears in the public domain, for example, in different news sources, publications, or any combination of searchable metadata accessible from a search. If the phrase “democratic drama” appears in the public domain more frequently than the predetermined value, then it is identified as a keyword, and the natural language search query is interpreted as a query for movies whose metadata contain the words “democratic drama,” such as in a plot summary.
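The occurrence test for a combined phrase might be sketched as below. The sketch assumes the public domain is available as a list of text documents. The disclosure does not specify how public-domain sources are searched, so `count_phrase_occurrences`, `is_keyword_phrase`, and the sample documents are hypothetical.

```python
import re

def count_phrase_occurrences(phrase, documents):
    """Count whole-phrase, case-insensitive matches across all documents."""
    pattern = re.compile(r"\b" + re.escape(phrase.lower()) + r"\b")
    return sum(len(pattern.findall(doc.lower())) for doc in documents)

def is_keyword_phrase(phrase, documents, min_occurrences=10):
    """Treat the phrase as a keyword if it occurs more than `min_occurrences` times."""
    return count_phrase_occurrences(phrase, documents) > min_occurrences

# Hypothetical stand-in for news sources and publications.
public_documents = [
    "A democratic drama unfolds. Democratic drama again dominates the news."
] * 6

occurrences = count_phrase_occurrences("democratic drama", public_documents)
combined_is_keyword = is_keyword_phrase("democratic drama", public_documents)
```

Here the phrase occurs twice in each of six documents, which exceeds the example threshold of 10, so the combined phrase would be interpreted as a keyword.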

The natural language search query can also be interpreted through the use of machine learning, such as using one or more neural networks. After identifying a number of terms in the natural language search query, a vector is generated for each term describing a relationship between each term and a plurality of other terms. Each vector is then inputted into a trained neural network that generates an output based on the input vectors. The natural language search query is then interpreted based on the output of the neural network. For example, for the query “democratic drama movies,” a vector for “democratic” may be generated that represents degrees of connection between “democratic” and other terms in the natural language search query. A vector for “democratic” may also be generated that represents degrees of connection between “democratic” and other terms, such as “political” and “election.” In some examples, a vector for “democratic drama” may be generated that represents degrees of connection between “democratic drama” and other terms, such as “democratic theater” and “political theater.” These vectors may be input into a neural network that processes each input vector and outputs an interpretation of the search query.
For example, the vector for “democratic drama” may return the “House of Cards” series as a democratic drama and may indicate a connection with democratic theater, such as a movie of the musical “Hamilton”; both “House of Cards” and “Hamilton” have a connection to “democratic drama.” The neural network may then output an interpretation of the search query that focuses on the series “House of Cards” and the movie “Hamilton.” Identifying terms such as “democratic” and “drama” as uncommon search terms results in a determination that the term is relevant to the search query, and a vector connecting these terms to a particular content type, for example, a movie, would result in an interpretation that accounts for the relevance of the term. Search results are then retrieved based on the interpreted search query and generated for display to the user.
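One way to picture a vector of “degrees of connection” is the co-occurrence sketch below. The disclosure does not state how these degrees are computed, so document-level co-occurrence is an assumption made only for illustration. The function name `connection_vector` and the sample documents are hypothetical, and the trained neural network that would consume such vectors is not shown.

```python
def connection_vector(term, related_terms, documents):
    """For each related term, return the share of documents containing `term`
    that also contain the related term (simple substring matching)."""
    docs = [doc.lower() for doc in documents]
    with_term = [doc for doc in docs if term in doc]
    if not with_term:
        return [0.0] * len(related_terms)
    return [
        sum(1 for doc in with_term if other in doc) / len(with_term)
        for other in related_terms
    ]

# Hypothetical corpus used only to illustrate the computation.
documents = [
    "a democratic drama about a political election",
    "democratic theater and political theater",
    "a romantic comedy",
]

vector = connection_vector("democratic", ["political", "election"], documents)
```

In this toy corpus, “political” appears in both documents that mention “democratic” and “election” appears in one of them, so the resulting vector is `[1.0, 0.5]`.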

BRIEF DESCRIPTION OF THE DRAWINGS

The above and other objects and advantages of the disclosure will be apparent upon consideration of the following detailed description, taken in conjunction with the accompanying drawings, in which like reference characters refer to like parts throughout, and in which:

FIG. 1 shows a system for interpreting a natural language search query, in accordance with some embodiments of the disclosure;

FIG. 2 shows a system for interpreting a natural language search query, in accordance with some embodiments of the disclosure;

FIG. 3 shows a system for interpreting a natural language search query, in accordance with some embodiments of the disclosure;

FIG. 4 is a block diagram showing components and data flow therebetween of a device for interpreting a natural language search query, in accordance with some embodiments of the disclosure;

FIG. 5 is a flowchart representing a process for interpreting a natural language search query, in accordance with some embodiments of the disclosure;

FIG. 6 is a flowchart representing a process for interpreting a natural language search query, in accordance with some embodiments of the disclosure; and

FIG. 7 is a flowchart representing a process for identifying a plurality of terms in a natural language search query, in accordance with some embodiments of the disclosure.

DETAILED DESCRIPTION

Methods and systems are disclosed herein for interpreting natural language search queries that account for contextual relevance of words of the search query that would ordinarily not be processed, including, for example, by processing each word or phrase of the query. As referred to herein, a query is any input from the user and may be a voice query, a text query, or a query comprising any other input form. A query may comprise a command, by which the user expects the computing device to perform a certain action (such as “play a movie”). A query may comprise a question, to which the user expects the computing device to provide an answer (such as “What is the tallest building in New York?”). The disclosed methods and systems may be implemented on a computing device. As referred to herein, the computing device can be any device comprising a processor and memory, for example, a television, a Smart TV, a set-top box, an integrated receiver decoder (IRD) for handling satellite television, a digital storage device, a digital media receiver (DMR), a digital media adapter (DMA), a streaming media device, a DVD player, a DVD recorder, a connected DVD, a local media server, a BLU-RAY player, a BLU-RAY recorder, a personal computer (PC), a laptop computer, a tablet computer, a WebTV box, a personal computer television (PC/TV), a PC media server, a PC media center, a handheld computer, a stationary telephone, a personal digital assistant (PDA), a mobile telephone, a portable video player, a portable music player, a portable gaming machine, a smartphone, or any other television equipment, computing equipment, or wireless device, and/or combination of the same.

The methods and/or any instructions for performing any of the embodiments discussed herein may be encoded on computer-readable media. Computer-readable media includes any media capable of storing data. The computer-readable media may be transitory, including, but not limited to, propagating electrical or electromagnetic signals, or may be non-transitory including, but not limited to, volatile and non-volatile computer memory or storage devices such as a hard disk, floppy disk, USB drive, DVD, CD, media cards, register memory, processor caches, Random Access Memory (“RAM”), etc.

FIG. 1 shows a first system for interpreting a natural language search query, in accordance with some embodiments of the disclosure. Natural language search query 100 may be received from a user or from an input device. Voice-user interface 102 may capture spoken words representing the natural language search query uttered by the user and transmit a digital representation of the spoken natural language search query to a user device 104. User device 104 processes the words of the natural language search query and transmits the search to the server 110 for interpretation. The user device splits natural language search query 100 into multiple-word phrase 148 and, based on grammatical structure, into a plurality of terms 124 a-124 c using natural language processing. User device 104 associates each term with a type of term. Term 124 a (“democratic”) is identified as a keyword; term 124 b (“drama”) is identified as a filter trigger, in this case a genre, indicating that at least one term that follows the filter trigger should be applied as a filter to the query; and term 124 c (“movies”) is identified as a content type. Based on these term types, user device 104 generates and performs a search to determine if the terms as designated by the term types provide the best result. User device 104 may transmit the phrase query 148, via a communications network 108, to server 110, which performs a search based on the phrase in the public domain and sends the results back to user device 104. The device determines if the terms as they were entered into the query may be interpreted using different term types. The system determines that two consecutive sets of terms or phrases may happen anywhere in the query. The system may combine the phrases or terms to perform a search in the public domain. The system performs a search for the combined phrase, including a first term or phrase and a second term or phrase, in the public domain 150, for example, news, science, history, or any other suitable content.
Even though the term “drama” is designated as a genre, the system performs a search of “democratic drama” as keywords to determine if the terms are used in any public domain as such. The number of occurrences of such a phrase with the first term or phrase and second term or phrase is determined. Based on the number of occurrences exceeding a threshold of the number of occurrences, the phrase is interpreted as keywords. As a result, a relevance value is determined for the first and second term.

User device 104 may also request or retrieve metadata describing content items from server 110 and use it to determine the relevance of each word or term of the natural language search query. Interpretation 106 of the natural language search query may be based on the relevance of each word or term to the subsequent word or term. For example, natural language search query 100 may be the words “democratic drama movies.” User device 104 determines, based on the metadata, that the words “democratic” and “drama” are a keyword and a genre, respectively, and are infrequent words, and must therefore be relevant to the query. User device 104 also determines that the word “movies” is a content type for which a search should be performed. Based on this information, user device 104 interprets the natural language search query and generates a corresponding query in a format that can be understood by server 110, such as an SQL “SELECT” command. The command shown in interpretation 106 is an SQL command to select all records from a “movies” table of a content database where any of a summary, a title, a plot synopsis, or metadata contains the words “democratic drama.” Various examples of interpreting natural language search queries are described in U.S. patent application Ser. Nos. 16/807,415, 16/807,419, 16/807,421, and 16/807,422 filed on Mar. 3, 2020, which are hereby incorporated by reference herein in their entirety.
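An SQL “SELECT” command of the kind described above might be constructed as in the following sketch. The table layout, the column names, and the helper `build_select` are assumptions made for illustration. The disclosure names only a “movies” table and the summary, title, plot synopsis, and metadata fields, and the sample records are invented.

```python
import sqlite3

# Hypothetical column names for the fields named in the description.
SEARCH_COLUMNS = ("title", "summary", "plot_synopsis", "metadata")
# Table names cannot be bound as SQL parameters, so they are checked against a whitelist.
ALLOWED_TABLES = {"movies"}

def build_select(content_type, phrase):
    """Build a parameterized SELECT matching `phrase` in any searchable column."""
    if content_type not in ALLOWED_TABLES:
        raise ValueError(f"unknown content type: {content_type!r}")
    where = " OR ".join(f"{column} LIKE ?" for column in SEARCH_COLUMNS)
    sql = f"SELECT * FROM {content_type} WHERE {where}"
    params = [f"%{phrase}%"] * len(SEARCH_COLUMNS)
    return sql, params

# In-memory stand-in for a content database, with invented records.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movies (title TEXT, summary TEXT, plot_synopsis TEXT, metadata TEXT)"
)
conn.executemany(
    "INSERT INTO movies VALUES (?, ?, ?, ?)",
    [
        ("Example A", "A democratic drama about an election", "", ""),
        ("Example B", "A romantic comedy", "", ""),
    ],
)

sql, params = build_select("movies", "democratic drama")
rows = conn.execute(sql, params).fetchall()
```

Binding the phrase as a parameter rather than concatenating it into the command is a design choice of this sketch, made so that user-supplied words cannot alter the structure of the query.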

FIG. 2 shows a second system for interpreting a natural language search query, in accordance with some embodiments of the disclosure. In some embodiments, a training data set 200 may be provided to, or accessed by, server 202. Server 202 may use the training data set to determine a list of relevant words or phrases 204. A natural language search query 206 is received, via voice-user interface 208, at user device 210. User device 210 may request or retrieve, via communications network 212, relevant words list 204 from server 202. User device 210 may compare each word or term or phrase of the natural language search query 206 to relevant words list 204 to determine relevant words or terms of the natural language search query 206. The relevant words of the natural language search query 206 are identified as keywords, and user device 210 interprets natural language search query 206 based on the identified keywords. For example, natural language search query 206 may be the words “democratic drama movies.” Based on the training data 200, server 202 determines that “democratic” and “drama” are infrequent words and adds them to the relevant words list 204. The user device may combine a multiple word query if it determines that such a combination is widely used in the public domain. For example, the user device 210 may perform a search for a natural language search query 206 that includes two terms as keywords and determine that such terms should be combined to provide desired results. In another example, user device 210 compares the words “democratic” and “drama” to the relevant words list 204 received from server 202 and finds that the words “democratic” and “drama” are included therein. The terms may be included as a single word or term result or may be a combination as a phrase. Based on this, user device 210 identifies “democratic” and “drama” as keywords.
As above, user device 210 identifies the word “movies” as a content type for which a query should be performed and generates interpretation 214, which may be an SQL command as described above.

FIG. 3 shows a third system for interpreting a natural language search query, in accordance with some embodiments of the disclosure. Natural language search query 300 is received by user device 302. For example, natural language search query 300 may be the phrase “democratic drama movies.” The user device splits natural language search query 300 into a plurality of terms 304 a-304 c using natural language processing. User device 302 associates each term with a type of term. Term 304 a (“democratic”) is identified as a keyword; term 304 b (“drama”) is identified as a filter trigger, in this case a genre, indicating that at least one term that follows the filter trigger should be applied as a filter to the query; and term 304 c (“movies”) is identified as a content type. Based on these associations, user device 302 generates interpretation 306, such as an SQL command, in which terms 304 a-304 c are included as corresponding portions 308 a-308 c of the SQL command. Portion 308 d, which initializes a search query, corresponds to the action of searching for the terms; portion 308 c, which identifies what records to select and from which table the records should be selected, corresponds to term 304 c; portion 308 a, which initializes a first keyword, corresponds to term 304 a; portion 308 b, which represents a second keyword, corresponds to second term 304 b; and portion 308 f represents the search parameters performed on the content metadata.

FIG. 4 is a block diagram representing components of a computing device and data flow therebetween of a device for interpreting a natural language search query, in accordance with some embodiments of the disclosure. A natural language search query may be received as voice input 400 using voice-user interface 402. Voice-user interface 402 may include a microphone or other audio capture device capable of capturing raw audio data and may convert raw audio data into a digital representation of voice input 400. Voice-user interface 402 may also include a data interface, such as a network connection using ethernet or WiFi, a Bluetooth connection, or any other suitable data interface for receiving digital audio from another input device. Voice-user interface 402 transmits 404 the digital representation of the voice input to control circuitry 406, where it is received using natural language processing circuitry 408. Natural language processing circuitry may transcribe the audio representing the natural language search query to generate a corresponding text string or may process the audio data directly. Alternatively, a natural language search query may be received as text input 410 using text-user interface 412, which may include similar data interfaces to those described above in connection with voice-user interface 402. Text-user interface 412 transmits 414 the text input 410 to control circuitry 406, where it is received using natural language processing circuitry 408.

Control circuitry 406 may be based on any suitable processing circuitry and comprises control circuits and memory circuits, which may be disposed on a single integrated circuit or may be discrete components. As referred to herein, processing circuitry should be understood to mean circuitry based on one or more microprocessors, microcontrollers, digital signal processors, programmable logic devices, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), etc., and may include a multi-core processor (e.g., dual-core, quad-core, hexa-core, or any suitable number of cores). In some embodiments, processing circuitry may be distributed across multiple separate processors or processing units, for example, multiple of the same type of processing units (e.g., two Intel Core i7 processors) or multiple different processors (e.g., an Intel Core i5 processor and an Intel Core i7 processor).

Natural language processing circuitry 408 identifies a plurality of terms in the natural language search query. For example, natural language processing circuitry 408 may identify individual words in the natural language search query using spaces in text input 410 or pauses or periods of silence in voice input 400. Natural language processing circuitry 408 analyzes a first word and determines whether the first word can be part of a larger phrase. For example, natural language processing circuitry 408 may request 416 a dictionary or other word list or phrase list from memory 418. Memory 418 may be any device for temporarily storing electronic data, such as random-access memory, hard drives, solid-state devices, quantum storage devices, or any other suitable fixed or removable storage device, and/or any combination of the same.

Upon receiving 420 the dictionary or word list or phrase list from memory 418, natural language processing circuitry 408 determines if the first word can be followed by at least a second word. If so, natural language processing circuitry 408 analyzes the first word together with the word immediately following the first word to determine if the two words together form a phrase. If so, the phrase is identified as a single term in the natural language search query. Otherwise, the first word alone is identified as a single term in the natural language search query.
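The term identification just described can be sketched as follows. The sketch assumes a text query split on spaces and a phrase list supplied as plain strings. The function name `identify_terms` is hypothetical, and the two checks in the description (whether the first word can be followed by a second word, and whether the pair forms a phrase) are collapsed into a single lookup for brevity.

```python
def identify_terms(query, phrase_list):
    """Split a query into terms, merging adjacent words that form a known phrase."""
    words = query.lower().split()
    phrases = {phrase.lower() for phrase in phrase_list}
    terms = []
    i = 0
    while i < len(words):
        pair = " ".join(words[i:i + 2])
        if i + 1 < len(words) and pair in phrases:
            terms.append(pair)      # the two words together form a single term
            i += 2
        else:
            terms.append(words[i])  # the first word alone is a single term
            i += 1
    return terms

terms = identify_terms("civil war documentary movies", ["civil war"])
```

For the example query, “civil war” is found on the assumed phrase list and is kept as one term, giving `["civil war", "documentary", "movies"]`.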

Once the terms of the natural language search query have been identified, natural language processing circuitry 408 associates each term with a type of term. Natural language processing circuitry 408 also determines a frequency with which each term occurs. For example, natural language processing circuitry 408 may request 422 metadata describing a plurality of content items from content metadata 424. Natural language processing circuitry 408 receives 426 the requested metadata and determines how many occurrences of each term there are in the metadata as a percentage of the total number of terms in the metadata. Using the part of speech and frequency of each term, natural language processing circuitry 408 determines a relevance for each term and interprets the natural language search query based on the relevance of each term.
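The frequency determination described above, expressed as a percentage of all terms in the metadata, may be sketched as below. The sketch assumes single-word terms and metadata available as plain text strings. The function name `term_frequencies` and the sample metadata are hypothetical.

```python
from collections import Counter

def term_frequencies(terms, metadata_texts):
    """Return each term's occurrences as a percentage of all words in the metadata."""
    tokens = [token for text in metadata_texts for token in text.lower().split()]
    counts = Counter(tokens)
    total = len(tokens)
    return {
        term: (100.0 * counts[term] / total if total else 0.0)
        for term in terms
    }

# Hypothetical metadata containing four words in total.
frequencies = term_frequencies(
    ["drama", "democratic", "movies"],
    ["drama drama comedy democratic"],
)
```

With this toy metadata, “drama” accounts for 50 percent of the words, “democratic” for 25 percent, and “movies” for 0 percent.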

Natural language processing circuitry 408 transmits 428 the interpretation of the natural language search query to query construction circuitry 430, which constructs a search query corresponding to the natural language search query in a format that can be understood by, for example, a content database. Query construction circuitry 430 transmits 456 a search for the format of the natural language search query to the public content domain database 458 to determine the relevance of the natural language search query. The query construction circuitry 430 transmits the query to determine if the first term and the second term are used together in the public domain commonly enough to warrant updating the format of the query to a specific term type shared between the first term and the second term. The public content database 458 transmits 460 to the query construction circuitry 430 the result of the search for the occurrence of the terms or words in the natural language search query. Query construction circuitry 430 transmits 432 the constructed search query to transceiver circuitry 434, which transmits 436 the search query to, for example, content database 438. Transceiver circuitry 434 may be a network connection such as an Ethernet port, WiFi module, or any other data connection suitable for communicating with a remote server. Transceiver circuitry 434 then receives 440 search results from the content database 438 and transmits 442 the search results to output circuitry 444. Output circuitry 444 then generates for display 446 the search results. Output circuitry 444 may be any suitable display driver or other graphic or video signal processing circuitry.

In some embodiments, a training data set is used to determine the relevance of each term. Training data 448 may be processed by control circuitry 406 or by a remote server to determine the relevance of a plurality of terms included in the training data. The resulting list of relevant terms is transmitted 450 to control circuitry 406, where it is received using transceiver circuitry 434. Transceiver circuitry 434 transmits 452 the received list of relevant terms to natural language processing circuitry 408 for use in determining the relevance of each term in the natural language search query.

FIG. 5 is a flowchart representing a third illustrative process 500 for interpreting a natural language search query, in accordance with some embodiments of the disclosure. Process 500 may be implemented on control circuitry 406. In addition, one or more actions of FIG. 5 may be incorporated into or combined with one or more actions of any other process or embodiment described herein.

At 502, control circuitry (e.g., control circuitry 406) receives a natural language search query. At 504, control circuitry 406, using natural language processing circuitry 408, determines whether the natural language search query comprises a complete sentence. For example, natural language processing circuitry 408 may use Hidden Markov Model or Conditional Random Field algorithms or a grammar engine to determine the structure of the natural language search query. If the natural language search query does comprise a complete sentence (“Yes” at 504), then, at 506, control circuitry 406 identifies a plurality of terms in the natural language search query. This may be accomplished using methods described below in connection with FIG. 7.

At 508, control circuitry 406 initializes a counter variable N, setting its value to one, and a variable T representing the total number of identified terms. At 510, control circuitry 406, using natural language processing circuitry 408, associates the N^(th) term with a part of speech. This may be accomplished using methods described below in connection with FIG. 7. At 512, control circuitry 406 determines whether N is equal to T, meaning that all terms of the natural language search query have been associated with a type of term. If N is not equal to T (“No” at 512), then, at 514, control circuitry 406 increments the value of N by one, and processing returns to step 510. If N is equal to T (“Yes” at 512), then, at 516, control circuitry 406 identifies, based on the sentence structure of the natural language search query, a query type. For example, if the natural language search query begins with two terms that are designated as keywords, the query type will be a query for content items matching filter parameters contained in the two terms as keywords. At 518, control circuitry 406 interprets the natural language search query, in the context of the query type, based on the type of term of each of the identified terms.

At 520, control circuitry 406 retrieves search results (e.g., from content database 438) based on the interpreted search query. At 522, control circuitry 406, using output circuitry 444, generates the search results for display to the user.

The actions or descriptions of FIG. 5 may be used with any other embodiment of this disclosure. In addition, the actions and descriptions described in relation to FIG. 5 may be done in suitable alternative orders or in parallel to further the purposes of this disclosure.

FIG. 6 is a flowchart representing a first illustrative process 600 for interpreting a natural language search query, in accordance with some embodiments of the disclosure. Process 600 may be implemented on control circuitry 406. In addition, one or more actions of FIG. 6 may be incorporated into or combined with one or more actions of any other process or embodiment described herein.

At 602, control circuitry (e.g., control circuitry 406) receives a natural language search query. At 604, control circuitry 406, using natural language processing circuitry 408, identifies a plurality of terms in the natural language search query. This may be accomplished using methods described below in connection with FIG. 7.

At 606, control circuitry 406 initializes a counter variable N, setting its value to one, and a variable T representing the total number of identified terms. At 608, control circuitry 406, using natural language processing circuitry 408, associates the N^(th) term of the natural language search query with a part of speech. For example, natural language processing circuitry 408 may access a dictionary or other word list or phrase list to identify a part of speech to which the N^(th) term corresponds. At 610, control circuitry 406, using natural language processing circuitry 408, determines a frequency with which the N^(th) term occurs in metadata describing content items. This may be accomplished using methods described above in connection with FIG. 5. At 612, natural language processing circuitry 408 determines a relevance for the N^(th) term based on the part of speech and the frequency of the N^(th) term.

At 614, control circuitry 406 determines whether N is equal to T, meaning that all terms of the natural language search query have been processed to determine their respective relevance. If N is not equal to T (“No” at 614), then, at 616, control circuitry 406 increments the value of N by one, and processing returns to step 608. If N is equal to T (“Yes” at 614), then, at 618, control circuitry 406 interprets the natural language search query based on the relevance of each term.

At 620, control circuitry 406 retrieves search results (e.g., from content database 438) based on the interpreted search query. At 622, control circuitry 406, using output circuitry 444, generates the search results for display to the user.
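
As a rough illustration of steps 608 through 618, the following Python sketch scores each term from its part of speech and its frequency in content metadata. The disclosure does not state how the two are combined, so the product of a part-of-speech weight and a raw occurrence count is purely an assumption, as are the names `POS_LEXICON`, `POS_WEIGHT`, `DEFAULT_WEIGHT`, `term_frequency`, `relevance`, and `rank_terms` and all of the weight values.

```python
from typing import Dict, List

# Hypothetical dictionary mapping words to parts of speech (step 608).
POS_LEXICON: Dict[str, str] = {
    "scary": "adjective",
    "movies": "noun",
    "with": "preposition",
    "clowns": "noun",
}

# Assumed weights per part of speech; the values are illustrative only.
POS_WEIGHT: Dict[str, float] = {
    "noun": 1.0,
    "adjective": 0.8,
    "preposition": 0.2,
}
DEFAULT_WEIGHT = 0.5  # used when the part of speech is unknown


def term_frequency(term: str, metadata: List[str]) -> int:
    """Count whole-word occurrences of a term in metadata strings (step 610)."""
    needle = term.lower()
    return sum(entry.lower().split().count(needle) for entry in metadata)


def relevance(term: str, metadata: List[str]) -> float:
    """Combine part of speech and frequency into one score (step 612).

    The multiplication is an assumed combining rule, not the disclosed one.
    """
    pos = POS_LEXICON.get(term.lower(), "unknown")
    weight = POS_WEIGHT.get(pos, DEFAULT_WEIGHT)
    return weight * term_frequency(term, metadata)


def rank_terms(terms: List[str], metadata: List[str]) -> Dict[str, float]:
    """Score every term and order terms from most to least relevant."""
    scores = {term: relevance(term, metadata) for term in terms}
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))
```

With such a scoring, a preposition such as "with" still receives a non-zero relevance when it appears in metadata, so it is weighed rather than discarded when the query is interpreted.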

The actions or descriptions of FIG. 6 may be used with any other embodiment of this disclosure. In addition, the actions and descriptions described in relation to FIG. 6 may be done in suitable alternative orders or in parallel to further the purposes of this disclosure.

FIG. 7 is a flowchart representing an illustrative process 700 for identifying a plurality of terms in a natural language search query, in accordance with some embodiments of the disclosure. Process 700 may be implemented on control circuitry 406. In addition, one or more actions of FIG. 7 may be incorporated into or combined with one or more actions of any other process or embodiment described herein.

At 702, control circuitry (e.g., control circuitry 406), using natural language processing circuitry 408, splits the natural language search query into a plurality of words. For example, natural language processing circuitry 408 may identify pauses or periods of silence in audio data representing the natural language search query and split the audio data at each period of silence to separate the audio data into audio chunks, each representing a single word. Alternatively, natural language processing circuitry 408 may receive the natural language search query as text or may transcribe audio data into corresponding text. Natural language processing circuitry 408 may then split the text into individual words at every space.

At 704, control circuitry 406, using natural language processing circuitry 408, determines whether a word or phrase that may occur at any part of the natural language search query can be part of a combined phrase. The natural language search query may include a number of phrases and terms, and any one word or phrase may be searched independently or combined for a search of a combined search phrase. For example, natural language processing circuitry 408 may access a dictionary, word list, or phrase list, and identify any phrases that begin with each word or phrase in the natural language search query. If a phrase beginning with a word or phrase is located (“Yes” at 704), then, at 706, natural language processing circuitry 408 determines whether the first word and a second word immediately following the first word form a phrase together. Natural language processing circuitry 408 may concatenate the first and second words to form a string representing a possible phrase formed by the first and second words together and compare the string to the dictionary, word list, or phrase list, as above. If the first and second words form a phrase together (“Yes” at 706), then, at 708, natural language processing circuitry 408 identifies the first and second words together as a single phrase. If the first and second words do not form a phrase together (“No” at 706) or if the first word cannot be part of a phrase at all (“No” at 704), then, at 710, natural language processing circuitry 408 identifies the first word as a single term.
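
Steps 702 through 710 can be sketched in Python as follows, for the text-input case only. The phrase list contents and the names `PHRASE_LIST` and `identify_terms` are hypothetical, and the sketch checks only two-word phrases, which is the case the steps describe; longer phrases would need an extension that the text does not detail.

```python
from typing import List, Set

# Hypothetical phrase list; the disclosure mentions a dictionary, word
# list, or phrase list but does not give its contents.
PHRASE_LIST: Set[str] = {"science fiction", "tom cruise", "romantic comedy"}


def identify_terms(query: str, phrases: Set[str] = PHRASE_LIST) -> List[str]:
    """Split a text query into single terms and two-word phrases."""
    words = query.lower().split()  # step 702: split at whitespace
    terms: List[str] = []
    i = 0
    while i < len(words):
        first = words[i]
        # Step 704: does any known phrase begin with this word?
        can_start_phrase = any(p.startswith(first + " ") for p in phrases)
        if can_start_phrase and i + 1 < len(words):
            # Step 706: concatenate the first and second words and
            # compare the resulting string against the phrase list.
            candidate = first + " " + words[i + 1]
            if candidate in phrases:
                terms.append(candidate)  # step 708: one single phrase
                i += 2
                continue
        terms.append(first)  # step 710: the word is a single term
        i += 1
    return terms
```

For instance, under this assumed phrase list the query "science fiction movies with Tom Cruise" is split into "science fiction", "movies", "with", and "tom cruise", while "science museum" remains two single terms because the concatenated string is not in the list.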

The actions or descriptions of FIG. 7 may be used with any other embodiment of this disclosure. In addition, the actions and descriptions described in relation to FIG. 7 may be done in suitable alternative orders or in parallel to further the purposes of this disclosure.

The processes described above are intended to be illustrative and not limiting. One skilled in the art would appreciate that the steps of the processes discussed herein may be omitted, modified, combined, and/or rearranged, and any additional steps may be performed without departing from the scope of the invention. More generally, the above disclosure is meant to be exemplary and not limiting. Only the claims that follow are meant to set bounds as to what the present invention includes. Furthermore, it should be noted that the features and limitations described in any one embodiment may be applied to any other embodiment herein, and flowcharts or examples relating to one embodiment may be combined with any other embodiment in a suitable manner, done in different orders, or done in parallel. In addition, the systems and methods described herein may be performed in real-time. It should also be noted that the systems and/or methods described above may be applied to, or used in accordance with, other systems and/or methods.

What is claimed is:
 1. A computer-implemented method for interpreting a natural language search query, the method comprising using processing circuitry for: receiving the natural language search query; determining, using natural language processing, that the natural language search query comprises a plurality of terms or phrases, each term or phrase of the plurality of terms or phrases is associated with a respective term type; determining, based on a grammatical structure, a first term type associated with a first term or phrase of the plurality of terms or phrases and a second term type associated with a second term or phrase of the plurality of terms or phrases, wherein the first term type is different than the second term type; performing a first search for the first term or phrase in a public domain and a second search for the second term or phrase in the public domain; determining a relation score between the first term or phrase and the second term or phrase based on the first search and the second search; based on the determined relation score, interpreting the natural language search query including the first term or phrase and the second term or phrase, in a context of the first term type; retrieving search results based on the interpreted natural language search query; and generating for display the search results.
 2. The method of claim 1, wherein the natural language search query is received from an input device.
 3. The method of claim 1, wherein the natural language search query is received as audio data, then the method further comprises transcribing the natural language search query into a plurality of words.
 4. The method of claim 1, wherein determining that the natural language search query comprises a plurality of terms or phrases, further comprises: splitting the natural language search query into a plurality of words; analyzing a first word of the plurality of words; determining, based on analyzing the first word, whether the first word can be part of the first term type; in response to determining that the first word can be part of the first term type, analyzing the first word together with a second word that immediately follows the first word; determining, based on analyzing the first word together with the second word, whether the first word and the second word can be analyzed using the first term type; in response to determining that the first word and the second word can be analyzed using the first term type, identifying the first and second word as associated with the first term type.
 5. The method of claim 1, further comprises: determining a respective frequency with which the second term or phrase immediately follows the first term or phrase of the plurality of terms or phrases in metadata describing a plurality of content items; determining a relevance for each term or phrase of the plurality of terms or phrases based on its respective term type and frequency; and interpreting the natural language search query based on the relevance of each term or phrase.
 6. The method of claim 5, wherein determining the respective frequency with which the second term or phrase immediately follows the first term or phrase of the plurality of terms or phrases in metadata describing a plurality of content items comprises: retrieving the metadata describing the plurality of content items; and counting the total occurrences of the second term or phrase immediately following the first term or phrase contained in the metadata.
 7. The method of claim 1, wherein the term type is selected from a keyword, a genre, and content type.
 8. The method of claim 1, further comprises: generating a respective vector for each term of the plurality of terms; accessing a knowledge graph associated with content metadata; identifying a plurality of terms or phrases to which each term or phrase of the plurality of terms or phrases connects in the knowledge graph; calculating a distance between each respective term or phrase and each term or phrase connected to the respective term or phrase; and generating the vector for each term or phrase based on connections of each respective term or phrase and the distance between each respective term or phrase and each term or phrase to which each respective term or phrase is connected.
 9. A system for interpreting a natural language search query, the system comprising control circuitry configured to: receive the natural language search query; determine, using natural language processing, that the natural language search query comprises a plurality of terms or phrases, each term or phrase of the plurality of terms or phrases is associated with a respective term type; determine, based on a grammatical structure, a first term type associated with a first term or phrase of the plurality of terms or phrases and a second term type associated with a second term or phrase of the plurality of terms or phrases, wherein the first term type is different than the second term type; perform a first search for the first term or phrase in a public domain and a second search for the second term or phrase in the public domain; determine a relation score between the first term or phrase and the second term or phrase based on the first search and the second search; based on the determined relation score, interpret the natural language search query including the first term or phrase and the second term or phrase, in a context of the first term type; retrieve search results based on the interpreted natural language search query; and generate for display the search results.
 10. The system of claim 9, wherein the natural language search query is received from an input device.
 11. The system of claim 9, wherein the natural language search query is received as audio data, then the method further comprises transcribing the natural language search query into a plurality of words.
 12. The system of claim 9, wherein the control circuitry configured to determine that the natural language search query comprises a plurality of terms, is further configured to: split the natural language search query into a plurality of words; analyze a first word of the plurality of words; determine, based on analyzing the first word, whether the first word can be part of the first term type; in response to determining that the first word can be part of the first term type, analyze the first word together with a second word that immediately follows the first word; determine, based on analyzing the first word together with the second word, whether the first word and the second word can be analyzed using the first term type; in response to determining that the first word and the second word can be analyzed using the first term type, identify the first and second word as associated with the first term type.
 13. The system of claim 9, the control circuitry is further configured to: determine a respective frequency with which the second term or phrase immediately follows the first term or phrase of the plurality of terms or phrases in metadata describing a plurality of content items; determine a relevance for each term or phrase of the plurality of terms or phrases based on its respective term type and frequency; and interpret the natural language search query based on the relevance of each term or phrase.
 14. The system of claim 13, wherein the control circuitry configured to determine the respective frequency with which the second term immediately follows the first term or phrase of the plurality of terms or phrases in metadata describing a plurality of content items is configured to: retrieve the metadata describing the plurality of content items; and count the total occurrences of the second term or phrase immediately following the first term or phrase contained in the metadata.
 15. The system of claim 9, wherein the term type is selected from a keyword, a genre, and content type.
 16. The system of claim 9, the control circuitry is further configured to: generate a respective vector for each term or phrase of the plurality of terms or phrases; access a knowledge graph associated with content metadata; identify a plurality of terms or phrases to which each term or phrase of the plurality of terms or phrases connects in the knowledge graph; calculate a distance between each respective term or phrase and each term or phrase connected to the respective term or phrase; and generate the vector for each term or phrase based on connections of each respective term or phrase and the distance between each respective term or phrase and each term or phrase to which each respective term or phrase is connected.
 17. A non-transitory computer-readable medium having non-transitory computer-readable instructions encoded thereon for interpreting a natural language search query that, when executed by control circuitry, cause the control circuitry to: receive the natural language search query; determine, using natural language processing, that the natural language search query comprises a plurality of terms or phrases, each term or phrase of the plurality of terms or phrases is associated with a respective term type; determine, based on a grammatical structure, a first term type associated with a first term or phrase of the plurality of terms or phrases and a second term type associated with a second term or phrase of the plurality of terms or phrases, wherein the first term type is different than the second term type; perform a first search for the first term or phrase in a public domain and a second search for the second term or phrase in the public domain; determine a relation score between the first term or phrase and the second term or phrase based on the first search and the second search; based on the determined relation score, interpret the natural language search query including the first term or phrase and the second term or phrase, in a context of the first term type; retrieve search results based on the interpreted natural language search query; and generate for display the search results.
 18. The non-transitory computer-readable medium of claim 17, wherein the natural language search query is received from an input device.
 19. The non-transitory computer-readable medium of claim 17, wherein the natural language search query is received as audio data, wherein the execution of the instructions further causes the control circuitry to transcribe the natural language search query into a plurality of words.
 20. The non-transitory computer-readable medium of claim 17, wherein execution of the instruction to determine that the natural language search query comprises a plurality of terms or phrases, is further configured to: split the natural language search query into a plurality of words; analyze a first word of the plurality of words; determine, based on analyzing the first word, whether the first word can be part of the first term type; in response to determining that the first word can be part of the first term type, analyze the first word together with a second word that immediately follows the first word; determine, based on analyzing the first word together with the second word, whether the first word and the second word can be analyzed using the first term type; in response to determining that the first word and the second word can be analyzed using the first term type, identify the first and second word as associated with the first term type.